aerial robot
SwarmDiffusion: End-To-End Traversability-Guided Diffusion for Embodiment-Agnostic Navigation of Heterogeneous Robots
Zhura, Iana, Karaf, Sausar, Batool, Faryal, Mudalige, Nipun Dhananjaya Weerakkodi, Serpiva, Valerii, Abdulkarim, Ali Alridha, Fedoseev, Aleksey, Seyidov, Didar, Amjad, Hajira, Tsetserukou, Dzmitry
Visual traversability estimation is critical for autonomous navigation, but existing VLM-based methods rely on hand-crafted prompts, generalize poorly across embodiments, and output only traversability maps, leaving trajectory generation to slow external planners. We propose SwarmDiffusion, a lightweight end-to-end diffusion model that jointly predicts traversability and generates a feasible trajectory from a single RGB image. To remove the need for annotated or planner-produced paths, we introduce a planner-free trajectory construction pipeline based on randomized waypoint sampling, Bézier smoothing, and regularization enforcing connectivity, safety, directionality, and path thinness. This enables learning stable motion priors without demonstrations. SwarmDiffusion leverages VLM-derived supervision without prompt engineering and conditions the diffusion process on a compact embodiment state, producing physically consistent, traversable paths that transfer across different robot platforms. Across indoor environments and two embodiments (quadruped and aerial), the method achieves 80-100% navigation success and 0.09s inference, and adapts to a new robot using only 500 additional visual samples. Reliable indoor navigation is fundamental to a wide range of robotic applications, including warehouse automation [1], industrial inspection [2], search and rescue, and autonomous logistics. In these settings, robots must continuously reason about where they can safely move and how to plan a feasible trajectory through cluttered, unstructured, and dynamic spaces.
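The planner-free trajectory construction mentioned in this abstract (randomized waypoint sampling followed by Bézier smoothing) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the uniform jitter model, and the 2D setting are assumptions made for illustration, and the regularization terms (connectivity, safety, directionality, thinness) are not modeled.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def sample_waypoints(start: Point, goal: Point, n_inner: int,
                     jitter: float, rng: random.Random) -> List[Point]:
    """Place n_inner waypoints on the start-goal line and perturb each one.

    The uniform perturbation is an illustrative assumption.
    """
    pts = [start]
    for i in range(1, n_inner + 1):
        t = i / (n_inner + 1)
        x = start[0] + t * (goal[0] - start[0]) + rng.uniform(-jitter, jitter)
        y = start[1] + t * (goal[1] - start[1]) + rng.uniform(-jitter, jitter)
        pts.append((x, y))
    pts.append(goal)
    return pts


def bezier_point(control: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve at parameter t with de Casteljau's algorithm."""
    pts = list(control)
    while len(pts) > 1:
        # Repeatedly interpolate between neighbouring points.
        pts = [((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1])
               for p, q in zip(pts, pts[1:])]
    return pts[0]


def smooth_path(control: List[Point], n_samples: int = 20) -> List[Point]:
    """Sample the Bezier curve defined by the control points (n_samples >= 2)."""
    return [bezier_point(control, i / (n_samples - 1)) for i in range(n_samples)]
```

The smoothed curve passes through the first and last control points only; the inner waypoints act as attractors, which is what makes the result smooth rather than piecewise linear.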
Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System
Liu, Haokun, Ma, Zhaoqi, Li, Yunong, Sugihara, Junichiro, Chen, Yicheng, Li, Jinjie, Zhao, Moju
Heterogeneous multirobot systems show great potential in complex tasks requiring coordinated hybrid cooperation. However, existing methods that rely on static or task-specific models often lack generalizability across diverse tasks and dynamic environments. This highlights the need for generalizable intelligence that can bridge high-level reasoning with low-level execution across heterogeneous agents. To address this, we propose a hierarchical multimodal framework that integrates a prompted large language model (LLM) with a fine-tuned vision-language model (VLM). At the system level, the LLM performs hierarchical task decomposition and constructs a global semantic map, while the VLM provides semantic perception and object localization, where the proposed GridMask significantly enhances the VLM's spatial accuracy for reliable fine-grained manipulation. The aerial robot leverages this global map to generate semantic paths and guide the ground robot's local navigation and manipulation, ensuring robust coordination even in target-absent or ambiguous scenarios. We validate the framework through extensive simulation and real-world experiments on long-horizon object arrangement tasks, demonstrating zero-shot adaptability, robust semantic navigation, and reliable manipulation in dynamic environments. To the best of our knowledge, this work presents the first heterogeneous aerial-ground robotic system that integrates VLM-based perception with LLM-driven reasoning for global high-level task planning and execution.
Online automatic code generation for robot swarms: LLMs and self-organizing hierarchy
Zhu, Weixu, Dorigo, Marco, Heinrich, Mary Katherine
This abstract was accepted to and presented at the "Multi-Agent Cooperative Systems and Swarm Robotics in the Era of Generative AI" (MACRAI) workshop at the 2025 IEEE/RSJ Int. Our recently introduced self-organizing nervous system (SoNS) provides robot swarms with 1) ease of behavior design and 2) global estimation of the swarm configuration and its collective environment, facilitating the implementation of online automatic code generation for robot swarms. In a demonstration with 6 real robots and simulation trials with >30 robots, we show that when a SoNS-enhanced robot swarm gets stuck, it can automatically solicit and run code generated by an external LLM on the fly, completing its mission with an 85% success rate. Swarm robotics research has demonstrated that many sophisticated behaviors with a large number of robots can be accomplished in a fully self-organized manner [1], but these fully self-organized behaviors have been slow to transfer to real applications. One reason for this is the fact that robots in a swarm are programmed at the individual level but the desired behavior occurs at the group level, and the design of fully self-organized group behaviors is often analytically intractable [2], [3], requiring extensive trial-and-error testing.
Real-Time Glass Detection and Reprojection using Sensor Fusion Onboard Aerial Robots
Hopkins, Malakhi, Murali, Varun, Kumar, Vijay, Taylor, Camillo J
It verifies that the space around the detected speckle is empty. To do this efficiently, an integral image of the binarized depth map is computed, which allows for rapid, constant-time queries of the pixel sum within any rectangular region. We check the pixel sum in eight rectangular regions surrounding the speckle's bounding box. If the ratio of filled pixels to total pixels within these regions is below a low threshold (e.g., 0.07), the speckle is considered isolated within a glass plane. Temporal Consistency: A final filter operates on a tracking-by-detection principle to ensure identified features are persistent and not transient sensor noise. A speckle is confirmed and passed to the mapping algorithm only after its detection count exceeds the required count (e.g., 1-3 detections) across multiple consecutive frames. To prevent the accumulation of false positives and old detections, a maximum-age parameter is used to expire and remove tracks that have not been seen for a specified duration. D. Transparent Plane Reprojection. The final stage of our methodology involves segmenting empty regions in the depth map and reprojecting the confirmed transparent planes. The algorithm first identifies the empty regions in the depth image and applies a non-maximum suppression (NMS) algorithm to merge redundant empty regions, ensuring a single, accurate representation of each transparent plane.
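The isolation check in this excerpt relies on an integral image (summed-area table) for constant-time rectangle sums. A minimal sketch follows. The layout of the eight surrounding regions (same size as the bounding box, clipped to the image) and the pooling of all regions into a single fill ratio are assumptions, since the excerpt does not specify them; the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), half-open


def integral_image(binary: List[List[int]]) -> List[List[int]]:
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(binary), len(binary[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += binary[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], r0: int, c0: int, r1: int, c1: int) -> int:
    """Sum of pixels in rows r0..r1-1 and columns c0..c1-1, in O(1)."""
    return ii[r1][c1] - ii[r0][c1] - ii[r1][c0] + ii[r0][c0]


def is_isolated(ii: List[List[int]], box: Box, max_fill: float = 0.07) -> bool:
    """Check the eight box-sized neighbours of `box` for filled pixels.

    Assumption: the fill ratio is pooled over all eight clipped regions.
    """
    h, w = len(ii) - 1, len(ii[0]) - 1
    r0, c0, r1, c1 = box
    bh, bw = r1 - r0, c1 - c0
    filled = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the speckle's own bounding box
            a0, a1 = max(0, r0 + dr * bh), min(h, r1 + dr * bh)
            b0, b1 = max(0, c0 + dc * bw), min(w, c1 + dc * bw)
            if a1 <= a0 or b1 <= b0:
                continue  # region lies entirely outside the image
            filled += rect_sum(ii, a0, b0, a1, b1)
            total += (a1 - a0) * (b1 - b0)
    return total > 0 and filled / total < max_fill
```

Building the table costs one pass over the image; after that each of the eight queries needs only four lookups, regardless of region size.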
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Collaborative Exploration with a Marsupial Ground-Aerial Robot Team through Task-Driven Map Compression
Zacharia, Angelos, Dharmadhikari, Mihir, Alexis, Kostas
Efficient exploration of unknown environments is crucial for autonomous robots, especially in confined and large-scale scenarios with limited communication. To address this challenge, we propose a collaborative exploration framework for a marsupial ground-aerial robot team that leverages the complementary capabilities of both platforms. The framework employs a graph-based path planning algorithm to guide exploration and deploy the aerial robot in areas where its expected gain significantly exceeds that of the ground robot, such as large open spaces or regions inaccessible to the ground platform, thereby maximizing coverage and efficiency. To facilitate large-scale spatial information sharing, we introduce a bandwidth-efficient, task-driven map compression strategy. This method enables each robot to reconstruct resolution-specific volumetric maps while preserving exploration-critical details, even at high compression rates. By selectively compressing and sharing key data, communication overhead is minimized, ensuring effective map integration for collaborative path planning. Simulation and real-world experiments validate the proposed approach, demonstrating its effectiveness in improving exploration efficiency while significantly reducing data transmission.
- Europe > Switzerland (0.04)
- Europe > Norway (0.04)
- Leisure & Entertainment > Games (0.73)
- Aerospace & Defense (0.46)
- Transportation (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Cloud-Assisted Remote Control for Aerial Robots: From Theory to Proof-of-Concept Implementation
Seisa, Achilleas Santi, Sankaranarayanan, Viswa Narayanan, Damigos, Gerasimos, Satpute, Sumeet Gajanan, Nikolakopoulos, George
Cloud robotics has emerged as a promising technology for robotics applications due to its advantages of offloading computationally intensive tasks, facilitating data sharing, and enhancing robot coordination. However, integrating cloud computing with robotics remains a complex challenge due to network latency, security concerns, and the need for efficient resource management. In this work, we present a scalable and intuitive framework for testing cloud and edge robotic systems. The framework consists of two main components enabled by containerized technology: (a) a containerized cloud cluster and (b) the containerized robot simulation environment. The system incorporates two endpoints of a User Datagram Protocol (UDP) tunnel, enabling bidirectional communication between the cloud cluster container and the robot simulation environment, while simulating realistic network conditions. To achieve this, we consider the use case of cloud-assisted remote control for aerial robots, while utilizing Linux-based traffic control to introduce artificial delay and jitter, replicating variable network conditions encountered in practical cloud-robot deployments.
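The abstract mentions Linux-based traffic control for injecting artificial delay and jitter. One common way to do this is the `netem` queueing discipline of the `tc` tool; whether the authors use `netem` specifically is an assumption. The sketch below only builds the command and does not run it; the helper name and the interface name `eth0` in the usage note are hypothetical.

```python
from typing import List


def netem_command(interface: str, delay_ms: int, jitter_ms: int) -> List[str]:
    """Build (but do not execute) a tc/netem command adding delay with jitter.

    The resulting command delays outgoing packets on `interface` by
    delay_ms, varied by up to +/- jitter_ms.
    """
    if delay_ms < 0 or jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    return [
        "tc", "qdisc", "add", "dev", interface, "root",
        "netem", "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
    ]
```

For example, `netem_command("eth0", 100, 20)` yields the argument list for `tc qdisc add dev eth0 root netem delay 100ms 20ms`, which could be passed to `subprocess.run(..., check=True)` with root privileges inside the simulation container.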
- Europe (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- Research Report (0.64)
- Overview (0.46)
Six-DoF Hand-Based Teleoperation for Omnidirectional Aerial Robots
Li, Jinjie, Li, Jiaxuan, Kaneko, Kotaro, Liu, Haokun, Shu, Liming, Zhao, Moju
Omnidirectional aerial robots offer full 6-DoF independent control over position and orientation, making them popular for aerial manipulation. Despite advancements in robotic autonomy, human operation remains essential in complex aerial environments. Existing teleoperation approaches for multirotors fail to fully leverage the additional DoFs provided by omnidirectional rotation. Additionally, the dexterity of human fingers should be exploited for more engaged interaction. In this work, we propose an aerial teleoperation system that brings the rotational flexibility of human hands into the unbounded aerial workspace. Our system includes two motion-tracking marker sets (one on the shoulder and one on the hand) along with a data glove to capture hand gestures. Using these inputs, we design four interaction modes for different tasks: Spherical Mode and Cartesian Mode for long-range movement, Operation Mode for precise manipulation, and Locking Mode for temporary pauses, with hand gestures used for seamless mode switching. We evaluate our system on a vertically mounted valve-turning task in the real world, demonstrating how each mode contributes to effective aerial manipulation. This interaction framework bridges human dexterity with aerial robotics, paving the way for enhanced aerial teleoperation in unstructured environments.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Asia > China > Liaoning Province > Dalian (0.04)
Learning Agile Tensile Perching for Aerial Robots from Demonstrations
Yuan, Kangle, Babgei, Atar, Romanello, Luca, Nguyen, Hai-Nguyen, Clark, Ronald, Kovac, Mirko, Armanini, Sophie F., Kocer, Basaran Bahadir
Perching on structures such as trees, beams, and ledges is essential for extending the endurance of aerial robots by enabling energy conservation in standby or observation modes. A tethered tensile perching mechanism offers a simple, adaptable solution that can be retrofitted to existing robots and accommodates a variety of structure sizes and shapes. However, tethered tensile perching introduces significant modelling challenges that require precise management of aerial robot dynamics, including tether slack and tension as well as momentum transfer. Achieving smooth wrapping and secure anchoring by targeting a specific tether segment adds further complexity. In this work, we present a novel trajectory framework for tethered tensile perching, utilizing reinforcement learning (RL) through the Soft Actor-Critic from Demonstrations (SACfD) algorithm. By incorporating both optimal and suboptimal demonstrations, our approach enhances training efficiency and responsiveness, achieving precise control over position and velocity. This framework enables the aerial robot to accurately target specific tether segments, facilitating reliable wrapping and secure anchoring. We validate our framework through extensive simulation and real-world experiments, and demonstrate its effectiveness in achieving agile and reliable trajectory generation for tensile perching.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Switzerland (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Energy (0.68)
- Transportation > Air (0.46)
- Aerospace & Defense (0.46)
Multi-robot Aerial Soft Manipulator For Floating Litter Collection
González-Morgado, Antonio, Smits, Sander, Heredia, Guillermo, Ollero, Anibal, Krupa, Alexandre, Chaumette, François, Spindler, Fabien, Franchi, Antonio, Gabellieri, Chiara
Removing floating litter from water bodies is crucial to preserving aquatic ecosystems and preventing environmental pollution. In this work, we present a multi-robot aerial soft manipulator for floating litter collection, leveraging the capabilities of aerial robots. The proposed system consists of two aerial robots connected by a flexible rope manipulator, which collects floating litter using a hook-based tool. Compared to single-aerial-robot solutions, the use of two aerial robots increases payload capacity and flight endurance while reducing the downwash effect at the manipulation point, located at the midpoint of the rope. Additionally, we employ an optimization-based rope-shape planner to compute the desired rope shape. The planner incorporates an adaptive behavior that maximizes grasping capabilities near the litter while minimizing rope tension when farther away. The computed rope shape trajectory is controlled by a shape visual servoing controller, which approximates the rope as a parabola. The complete system is validated in outdoor experiments, demonstrating successful grasping operations. An ablation study highlights how the planner's adaptive mechanism improves the success rate of the operation. Furthermore, real-world tests in a water channel confirm the effectiveness of our system in floating litter collection. These results demonstrate the potential of aerial robots for autonomous litter removal in aquatic environments. In 2019, 353 million tonnes of plastic waste was generated, only 9% of which was recycled, while 22% was mismanaged, with a considerable portion of it ending up in water. The best solutions to plastic pollution include preventing it from entering the environment, e.g., by limiting single-use plastic, and improving plastic management [1].
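The abstract states that the shape controller approximates the rope as a parabola. As a rough illustration of that approximation only, the sketch below recovers the coefficients of y = a·x² + b·x + c from three points in a vertical plane, for instance the two rope attachment points and the midpoint. The choice of three points, the 2D frame, and the function name are assumptions; the authors' controller is not described at this level of detail here.

```python
from typing import Tuple

Point = Tuple[float, float]


def parabola_through(p0: Point, p1: Point, p2: Point) -> Tuple[float, float, float]:
    """Return (a, b, c) of y = a*x^2 + b*x + c through three points.

    Uses Lagrange interpolation; the three x-coordinates must be distinct.
    """
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    if d0 == 0 or d1 == 0 or d2 == 0:
        raise ValueError("x-coordinates must be distinct")
    a = y0 / d0 + y1 / d1 + y2 / d2
    b = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    c = y0 * x1 * x2 / d0 + y1 * x0 * x2 / d1 + y2 * x0 * x1 / d2
    return a, b, c
```

With attachment points at equal height, the coefficient `a` alone captures how much the rope sags, which is one reason a parabola is a convenient low-dimensional shape description.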
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Europe > Netherlands (0.04)
A Hierarchical Graph-Based Terrain-Aware Autonomous Navigation Approach for Complementary Multimodal Ground-Aerial Exploration
Patel, Akash, Saucedo, Mario A. V., Stathoulopoulos, Nikolaos, Sankaranarayanan, Viswa Narayanan, Tevetzidis, Ilias, Kanellakis, Christoforos, Nikolakopoulos, George
Autonomous navigation in unknown environments is a fundamental challenge in robotics, particularly in coordinating ground and aerial robots to maximize exploration efficiency. This paper presents a novel approach that utilizes a hierarchical graph to represent the environment, encoding both geometric and semantic traversability. The framework enables the robots to compute a shared confidence metric, which helps the ground robot assess terrain and determine when deploying the aerial robot will extend exploration. The robot's confidence in traversing a path is based on factors such as predicted volumetric gain, path traversability, and collision risk. A hierarchy of graphs is used to maintain an efficient representation of traversability and frontier information through multi-resolution maps. Evaluated in a real subterranean exploration scenario, the approach allows the ground robot to autonomously identify zones that are no longer traversable but suitable for aerial deployment. By leveraging this hierarchical structure, the ground robot can selectively share graph information on confidence-assessed frontier targets from parts of the scene, enabling the aerial robot to navigate beyond obstacles and continue exploration.
- Europe > Sweden > Norrbotten County > Luleå (0.04)
- Europe > Montenegro (0.04)